Analyzing Trends Of Employment Across Sectors in the United States

Tousif Khan

Datasets provided by Our World In Data

My final project is an expansion of my midterm submission, and in this final submission I went back and redid the graphs from my midterm, now using plotly. I like the very clean and interactive features of plotly graphs and will be using them throughout my final project.

While I am a Math major, I have a pretty keen interest in Computer Science (which I am minoring in), and so I am used to reading documentation (and have read the pandas documentation thoroughly), as well as using intuitive Python code which helps a lot when sorting through data.

My interest in labor studies comes from a more personal perspective, as an enthusiast of political economy and labor organizing.

For this analysis, I will only focus on the data for the United States. But in a future analysis, observing global economic and industrial trends should provide interesting insights.

This first file shows the number of workers employed across agriculture, industry, and the service sectors over the years since 1840.
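As a rough sketch of how such a file can be narrowed to the United States and reshaped for plotting, the snippet below filters on an `Entity` column and melts the sector columns into long format. The column names (`Entity`, `Year`, `agriculture`, `industry`, `services`) are assumptions about the file layout, and the numbers are placeholders, not values from the dataset.

```python
import pandas as pd

# Placeholder rows standing in for the real file; column names are assumed.
owid = pd.DataFrame({
    "Entity": ["United States", "United States", "France"],
    "Year": [1840, 1850, 1840],
    "agriculture": [1, 2, 3],
    "industry": [4, 5, 6],
    "services": [7, 8, 9],
})

# Keep only the rows for the United States.
us = owid[owid["Entity"] == "United States"]

# One row per (year, sector) pair, which is the shape most plotting
# libraries expect when drawing one line per sector.
long = us.melt(
    id_vars=["Entity", "Year"],
    var_name="sector",
    value_name="workers",
)
print(long)
```

The long format makes it easy to pass `sector` as the color or grouping argument when drawing one line per sector.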

Visualizing this data shows us quite a lot. Service employment has grown explosively since the 1950s, and the service sector has been the clearly dominant force in the US economy ever since.

From my knowledge of history, I know that in the post-WWII era the United States began outsourcing manufacturing industry and exploiting the cheaper labor and resources found in the Global South. These conditions led to the workforce in the US largely taking up service work in lieu of industrial and manufacturing jobs.

This article briefly mentions that outsourcing and also discusses a second wave of outsourcing happening in the US.


Now I will take a look at the trends of GDP across the agriculture, industry, and service sectors since 1840.

It was surprising to see the service sector account for over 80% of the US economy. Also very surprising is how small a share of US GDP agriculture takes up, as I thought it would be higher given our corn and grain production.

This too shows the clear boom of the US service sector in the post-WWII world. It also shows the rapid fall of the agriculture sector in the US since the late 1800s. The United States Industrial Revolution occurred around then, which led to a boom in manufacturing, industry and services as agriculture became less dominant in the national economy.

On a note about the accuracy of the dataset I am using, the lines become more "wild" around 1910, which suggests more frequent data points from then on. The smoother lines before 1910 are likely due to sparser data, so the data points from 1910 forward probably give a more accurate picture of the US economy.
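One way to check this impression is to count how many observations fall in each decade. The sketch below uses placeholder years (sparse early, annual later) rather than the real dataset, and `observations_per_decade` is a hypothetical helper name.

```python
import pandas as pd

# Placeholder years only, not the real dataset: sparse early, annual later.
years = pd.Series([1840, 1850, 1860, 1870, 1880, 1890, 1900,
                   1910, 1911, 1912, 1913, 1914, 1915])


def observations_per_decade(year_values: pd.Series) -> pd.Series:
    """Count how many data points fall in each decade."""
    decades = (year_values // 10) * 10
    return decades.value_counts().sort_index()


counts = observations_per_decade(years)
print(counts)
```

A jump in the count per decade would support the idea that the smoother early lines come from interpolating between few data points.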


Reflecting on this observation of the US economic trend over the past few decades makes me curious about specific occupations and industry sectors in the US too. I have acquired a database which provides this information and will expand upon my mid-semester project. The next several cells will analyze 2021 data on jobs and occupations in the United States, looking into their total employment per employed population and comparing wages.

I will be dropping irrelevant columns to make working with the dataframe easier.
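A minimal sketch of that step with `DataFrame.drop` is shown here. Apart from `H_MEAN`, every column name is an assumption made for illustration, and the rows are placeholders.

```python
import pandas as pd

# Tiny stand-in for the occupation table. Apart from H_MEAN, the column
# names are assumed for illustration.
jobs = pd.DataFrame({
    "AREA_TITLE": ["U.S.", "U.S."],
    "OCC_TITLE": ["Occupation A", "Occupation B"],
    "O_GROUP": ["detailed", "detailed"],
    "TOT_EMP": [1000, 500],
    "H_MEAN": ["15.00", "30.00"],
    "PCT_TOTAL": [None, None],
})

columns_to_drop = ["AREA_TITLE", "PCT_TOTAL"]

# errors="ignore" skips any listed name that is absent instead of raising.
jobs_trimmed = jobs.drop(columns=columns_to_drop, errors="ignore")
print(jobs_trimmed.columns.tolist())
```

Assigning the result to a new name leaves the original dataframe intact, which helps when a notebook cell is re-run.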

One quick note about this database is that it conveniently has several levels of groupings for jobs and occupations: 1) Major groups (major categories of jobs) 2) Minor groups (more specific categories of jobs) 3) Broad groups 4) Detailed (the specific jobs themselves)
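The grouping levels make it straightforward to pick one level and rank it by employment. The sketch below assumes the level is stored in a column called `O_GROUP` and total employment in `TOT_EMP`; both names, the helper `most_populous`, and all values are illustrative assumptions.

```python
import pandas as pd

# Placeholder rows; O_GROUP and TOT_EMP are assumed column names.
jobs = pd.DataFrame({
    "OCC_TITLE": ["Group X", "Job A", "Job B", "Job C", "Job D"],
    "O_GROUP": ["major", "detailed", "detailed", "detailed", "detailed"],
    "TOT_EMP": [9000, 3000, 1000, 4000, 2000],
})


def most_populous(df: pd.DataFrame, level: str, n: int = 3) -> pd.DataFrame:
    """Return the n rows with the highest employment at one grouping level."""
    subset = df[df["O_GROUP"] == level]
    return subset.nlargest(n, "TOT_EMP")


top3 = most_populous(jobs, "detailed", n=3)
print(top3["OCC_TITLE"].tolist())
```

Filtering by level first matters because the higher-level rows aggregate the detailed ones and would otherwise crowd out the ranking.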

Retail salespeople, cashiers, and fast food workers are the top 3 most populated jobs in the US!

Taking a cursory glance at their wages, it seems the most common and populated jobs are also the ones that pay very poorly. I will look into the wage makeup of the top several detailed job categories to compare.

For now, let's look at the broad category of jobs and see which ones are the most populous.

In the broad groupings of jobs, manual laborers, retail workers/cashiers, and fast food workers are the most populous. Looking at their mean wage (the "H_MEAN" column), they are also paid very poorly, not to mention that the work is physically demanding.

Visualizing the major groupings of jobs

Note below that the wage information is not stored as numbers, but as strings (dtype "object").

Here, I will convert the wage information to floating-point values so I can work with it numerically rather than as strings.
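One possible way to do this conversion is `pd.to_numeric` with `errors="coerce"`, which turns anything unparseable into `NaN`. The sample strings below, including the thousands separator and the `*` and `#` markers, are assumptions about what the raw column might contain, and `wages_to_float` is a hypothetical helper.

```python
import pandas as pd

# Assumed examples of raw wage strings: plain numbers, a thousands
# separator, stray whitespace, and non-numeric marker symbols.
raw = pd.Series(["14.87", "1,234.50", "*", "#", " 23.10 "])


def wages_to_float(col: pd.Series) -> pd.Series:
    """Convert wage strings to floats; unparseable entries become NaN."""
    cleaned = col.astype(str).str.replace(",", "", regex=False).str.strip()
    return pd.to_numeric(cleaned, errors="coerce")


wages = wages_to_float(raw)
print(wages)
```

Coercing to `NaN` keeps the rows in place, so they can be dropped or inspected explicitly later instead of silently disappearing.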

Graphing the top 50 most populous occupations and noting their mean hourly wage. As I noted before, the jobs held by the most people pay severely low wages. The color scale splits the mean hourly wage into an upper section (blue), a middle section (white), and a lower section (red), and it is clear that the vast majority of the 50 most populous jobs pay less than the average mean hourly wage, as indicated by the predominance of red in the graph.

Comparing the data from before with the median wage, we see a similar makeup. Poverty wages still predominate among the most populous jobs.

Focusing on the top 10 most populous jobs, for a clearer and closer look at the above information.

Graphing the broad categories of occupations, we continue to see similar results. One thing worth noting is that Registered Nurses are an outlier in this graph: they are paid fairly well by comparison while also being a relatively populous occupation.

Now, I will look at the highest paid occupations and compare them to their popularity as an occupation.

So it seems (unsurprisingly) that the highest paid occupations are medical professions, managers, and executives.

This is perhaps the most useful graph of them all, as it very plainly reveals the wage gap in the United States. This box-and-whisker plot shows a distribution heavily skewed toward poverty wages. The vast majority of the working class make low wages, while only a select few occupations make decent or high wages.

It is astounding that the median wage is only $23.28/hr, which is less than a quarter of the maximum wage.
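The numbers behind a box-and-whisker plot can be computed directly with `Series.quantile`, along with the ratio of the median to the maximum. The wages below are placeholder values chosen to mimic a right-skewed distribution, not figures from the dataset.

```python
import pandas as pd

# Placeholder hourly wages with one high outlier, not real data.
wages = pd.Series([10.0, 12.0, 15.0, 20.0, 100.0])

# Minimum, quartiles, and maximum: the values a box plot is drawn from.
summary = wages.quantile([0, 0.25, 0.5, 0.75, 1.0])

# How large the median is relative to the highest wage.
median_to_max = wages.median() / wages.max()

print(summary)
print(f"median / max = {median_to_max:.2f}")
```

A small ratio like this, together with a median that sits close to the lower quartile, is what a distribution skewed toward low wages looks like numerically.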

One other thing to note is that this database only accounts for income from occupations and does not include the wealth that comes from owning stocks, assets, and private property. Taking all of those into account makes the gap between the rich and the poor even larger.

Given more time and more data, I would have liked to chart the difference in wealth between the rich and the poor over the decades, to see what trends there are and whether it is true that the rich have been getting richer and the poor poorer. However, an accurate analysis of that would need to account not only for the wealth generated by occupational income, but also for wealth from assets and private property.